AITopics | human visual system

Generalisation in humans and deep neural networks

Neural Information Processing SystemsMar-16-2026, 17:58:00 GMT

We compare the robustness of humans and current convolutional deep neural networks (DNNs) on object recognition under twelve different types of image degradations. First, using three well known DNNs (ResNet-152, VGG-19, GoogLeNet) we find the human visual system to be more robust to nearly all of the tested image manipulations, and we observe progressively diverging classification error-patterns between humans and DNNs when the signal gets weaker. Secondly, we show that DNNs trained directly on distorted images consistently surpass human performance on the exact distortion types they were trained on, yet they display extremely poor generalisation abilities when tested on other distortion types. For example, training on salt-and-pepper noise does not imply robustness on uniform white noise and vice versa. Thus, changes in the noise distribution between training and testing constitutes a crucial challenge to deep learning vision systems that can be systematically addressed in a lifelong machine learning approach. Our new dataset consisting of 83K carefully measured human psychophysical trials provide a useful reference for lifelong robustness against image degradations set by the human visual system.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Emergence of Shape Bias in Convolutional Neural Networks through Activation Sparsity

Neural Information Processing SystemsDec-27-2025, 00:44:47 GMT

Current deep-learning models for object recognition are known to be heavily biased toward texture. In contrast, human visual systems are known to be biased toward shape and structure. What could be the design principles in human visual systems that led to this difference? How could we introduce more shape bias into the deep learning models? In this paper, we report that sparse coding, a ubiquitous principle in the brain, can in itself introduce shape bias into the network.

convolutional neural network, name change, shape bias, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Generalisation in humans and deep neural networks

Neural Information Processing SystemsNov-20-2025, 21:43:19 GMT

We compare the robustness of humans and current convolutional deep neural networks (DNNs) on object recognition under twelve different types of image degradations. First, using three well known DNNs (ResNet-152, VGG-19, GoogLeNet) we find the human visual system to be more robust to nearly all of the tested image manipulations, and we observe progressively diverging classification error-patterns between humans and DNNs when the signal gets weaker. Secondly, we show that DNNs trained directly on distorted images consistently surpass human performance on the exact distortion types they were trained on, yet they display extremely poor generalisation abilities when tested on other distortion types. For example, training on salt-and-pepper noise does not imply robustness on uniform white noise and vice versa. Thus, changes in the noise distribution between training and testing constitutes a crucial challenge to deep learning vision systems that can be systematically addressed in a lifelong machine learning approach. Our new dataset consisting of 83K carefully measured human psychophysical trials provide a useful reference for lifelong robustness against image degradations set by the human visual system.

deep neural network, generalisation, name change, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Learning visual biases from human imagination

Carl Vondrick, Hamed Pirsiavash, Aude Oliva, Antonio Torralba

Neural Information Processing SystemsOct-2-2025, 10:56:45 GMT

Although the human visual system can recognize many concepts under challenging conditions, it still has some biases. In this paper, we investigate whether we can extract these biases and transfer them into a machine recognition system. We introduce a novel method that, inspired by well-known tools in human psychophysics, estimates the biases that the human visual system might use for recognition, but in computer vision feature spaces. Our experiments are surprising, and suggest that classifiers from the human visual system can be transferred into a machine with some success. Since these classifiers seem to capture favorable biases in the human visual system, we further present an SVM formulation that constrains the orientation of the SVM hyperplane to agree with the bias from human visual system. Our results suggest that transferring this human bias into machines may help object recognition systems generalize across datasets and perform better when very little training data is available.

artificial intelligence, machine learning, template, (17 more...)

Neural Information Processing Systems

Country:

Asia > India (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Maryland > Baltimore County (0.04)
North America > United States > Maryland > Baltimore (0.04)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Emergence of Shape Bias in Convolutional Neural Networks through Activation Sparsity

Neural Information Processing SystemsJan-20-2025, 00:48:08 GMT

Current deep-learning models for object recognition are known to be heavily biased toward texture. In contrast, human visual systems are known to be biased toward shape and structure. What could be the design principles in human visual systems that led to this difference? How could we introduce more shape bias into the deep learning models? In this paper, we report that sparse coding, a ubiquitous principle in the brain, can in itself introduce shape bias into the network.

activation sparsity, convolutional neural network, shape bias, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Learning visual biases from human imagination

Neural Information Processing SystemsOct-11-2024, 16:13:30 GMT

Although the human visual system can recognize many concepts under challengingconditions, it still has some biases. In this paper, we investigate whether wecan extract these biases and transfer them into a machine recognition system.We introduce a novel method that, inspired by well-known tools in humanpsychophysics, estimates the biases that the human visual system might use forrecognition, but in computer vision feature spaces. Our experiments aresurprising, and suggest that classifiers from the human visual system can betransferred into a machine with some success. Since these classifiers seem tocapture favorable biases in the human visual system, we further present an SVMformulation that constrains the orientation of the SVM hyperplane to agree withthe bias from human visual system. Our results suggest that transferring thishuman bias into machines may help object recognition systems generalize acrossdatasets and perform better when very little training data is available.

human imagination, human visual system, learning visual bias, (1 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.86)
Information Technology > Artificial Intelligence > Vision (0.66)

Add feedback

Interpretable Action Recognition on Hard to Classify Actions

Anichenko, Anastasia, Guerin, Frank, Gilbert, Andrew

arXiv.org Artificial IntelligenceSep-19-2024

We investigate a human-like interpretable model of video understanding. Humans recognise complex activities in video by recognising critical spatio-temporal relations among explicitly recognised objects and parts, for example, an object entering the aperture of a container. To mimic this we build on a model which uses positions of objects and hands, and their motions, to recognise the activity taking place. To improve this model we focussed on three of the most confused classes (for this model) and identified that the lack of 3D information was the major problem. To address this we extended our basic model by adding 3D awareness in two ways: (1) A state-of-the-art object detection model was fine-tuned to determine the difference between "Container" and "NotContainer" in order to integrate object shape information into the existing object features. (2) A state-of-the-art depth estimation model was used to extract depth values for individual objects and calculate depth relations to expand the existing relations used our interpretable model. These 3D extensions to our basic model were evaluated on a subset of three superficially similar "Putting" actions from the Something-Something-v2 dataset. The results showed that the container detector did not improve performance, but the addition of depth relations made a significant improvement to performance.

interpretable action recognition, relation, video, (12 more...)

arXiv.org Artificial Intelligence

2409.13091

Country: Europe > United Kingdom > England > Surrey > Guildford (0.04)

Genre: Research Report (0.91)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.32)

Add feedback

Learning visual biases from human imagination

Neural Information Processing SystemsMar-13-2024, 01:29:22 GMT

Although the human visual system can recognize many concepts under challenging conditions, it still has some biases. In this paper, we investigate whether we can extract these biases and transfer them into a machine recognition system. We introduce a novel method that, inspired by well-known tools in human psychophysics, estimates the biases that the human visual system might use for recognition, but in computer vision feature spaces. Our experiments are surprising, and suggest that classifiers from the human visual system can be transferred into a machine with some success. Since these classifiers seem to capture favorable biases in the human visual system, we further present an SVM formulation that constrains the orientation of the SVM hyperplane to agree with the bias from human visual system. Our results suggest that transferring this human bias into machines may help object recognition systems generalize across datasets and perform better when very little training data is available.

human visual system, noise, template, (15 more...)

Neural Information Processing Systems

Country:

Asia > India (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Maryland > Baltimore County (0.04)
North America > United States > Maryland > Baltimore (0.04)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Hierarchical Representations for Spatio-Temporal Visual Attention Modeling and Understanding

Fernández-Torres, Miguel-Ángel

arXiv.org Artificial IntelligenceAug-9-2023

Thesis concerns the study and development of hierarchical representations for spatio-temporal visual attention modeling and understanding in video sequences. More specifically, we propose two computational models for visual attention. First, we present a generative probabilistic model for context-aware visual attention modeling and understanding. Secondly, we develop a deep network architecture for visual attention modeling, which first estimates top-down spatio-temporal visual attention, and ultimately serves for modeling attention in the temporal domain. The first part of the thesis introduces our first proposal: a generative probabilistic framework for spatio-temporal visual attention modeling and understanding.

artificial intelligence, machine learning, natural language, (23 more...)

arXiv.org Artificial Intelligence

2308.05189

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.13)
Europe > Spain > Galicia > Madrid (0.04)
Europe > United Kingdom > England > Greater London > London > Wimbledon (0.04)
(19 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Media > Film (1.00)
Leisure & Entertainment > Sports (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(6 more...)

Add feedback

A self-organizing multiple-view representation of 3D objects

Neural Information Processing SystemsApr-6-2023, 19:46:45 GMT

The form in which these models are best stored depends on the kind of information available in the input, and on the trade-off between the amount of memory allocated for the storage and the degree of sophistication required of the recognition process. In computer vision, a distinction can be made between representation schemes that use 3D object-centered coordinate systems and schemes that store viewpoint-specific information such as 2D views of objects.

cid, representation, self-organizing multiple-view representation, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.69)

Add feedback

Filters

Collaborating Authors

human visual system

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Generalisation in humans and deep neural networks

Emergence of Shape Bias in Convolutional Neural Networks through Activation Sparsity

Generalisation in humans and deep neural networks

Learning visual biases from human imagination

Emergence of Shape Bias in Convolutional Neural Networks through Activation Sparsity

Learning visual biases from human imagination

Interpretable Action Recognition on Hard to Classify Actions

Learning visual biases from human imagination

Hierarchical Representations for Spatio-Temporal Visual Attention Modeling and Understanding

A self-organizing multiple-view representation of 3D objects